Defeating line-noise CAPTCHAs with multiple quadratic snakes
Abstract
Optical character recognition (OCR) is one of the fundamental problems in artificial intelligence and image processing, but recent progress in OCR represents a security challenge for Web sites that throttle requests with image-based CAPTCHAs (Completely Automated Public Turing Tests to Tell Computers and Humans Apart). A CAPTCHA is a challenge-response test placed within web forms to determine whether the user is human. Unfortunately, algorithms capable of solving image-based CAPTCHAs can be used to create spam accounts and design malicious denial-of-service (DoS) attacks, causing financial and social damage. The problem of defeating digital image CAPTCHAs is thus twofold. On the one hand, it is an important problem in artificial intelligence and image processing. On the other hand, publicly available CAPTCHAs that are not tested against state-of-the-art machine recognition algorithms may make the systems vulnerable to attack by software bots. This paper considers a very important subclass of text CAPTCHAs, those characterized by salt-and-pepper noise combined with line (curve) noise. Thus far, attacks on CAPTCHAs with this type of noise have used relatively simple image processing methods with some success, but state-of-the-art segmentation methods have not been fully exploited. In this paper, we propose and benchmark two strong segmentation methods. The first method is a modification of a multiple quadratic snake proposed for road extraction from satellite images. The second, competing method is a boundary tracing routine available in the OpenCV open source library. A first numerical experiment indicates excellent accuracy for both methods. A second experiment on human recognition shows that the CAPTCHAs used in the study are already near the threshold of being too hard for humans. Finally, a third numerical experiment presents a more difficult set of CAPTCHAs with the addition of anti-binarization methods.
The snake-based method is shown to be more resilient to anti-binarization schemes than boundary tracing and state-of-the-art projection-based attacks on CAPTCHAs. Since CAPTCHAs corrupted by small line noise are shown to be difficult for humans and relatively easy for our algorithm, CAPTCHA designers should introduce more challenging distortions into their CAPTCHAs, lest the security of systems based on them be compromised. © 2013 Elsevier Ltd. All rights reserved.
Author contact (S.S. Makhanov): [...]ional Institute of Technology, Thammasat University, Pathum Thani 12000, Thailand.
Computers & Security 37 (2013) 91–110
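The abstract names projection-based attacks as a baseline and line noise as the distortion under study. The toy sketch below is not the paper's snake method, its OpenCV boundary tracing, or any published attack. It only illustrates, under stated assumptions, why line noise is a problem for a naive column-projection segmenter: a single stroke crossing the gap between two glyphs removes the empty columns the segmenter relies on. The function names, the `min_pixels` parameter, and the 5x7 images are all hypothetical.

```python
def column_projection(image):
    """Count foreground (1) pixels in each column of a binary image."""
    return [sum(column) for column in zip(*image)]


def segment_by_projection(image, min_pixels=1):
    """Return (start, end) column ranges, end exclusive, where the
    column projection is at least min_pixels (hypothetical parameter)."""
    segments = []
    start = None
    for x, count in enumerate(column_projection(image)):
        if count >= min_pixels and start is None:
            start = x                      # a run of glyph columns begins
        elif count < min_pixels and start is not None:
            segments.append((start, x))    # the run ended at column x
            start = None
    if start is not None:                  # run touches the right border
        segments.append((start, len(image[0])))
    return segments


# Hypothetical binarized CAPTCHA: two 2-pixel-wide glyphs with a gap.
CLEAN = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

# The same image with one-pixel-thick horizontal line noise on row 2.
NOISY = [row[:] for row in CLEAN]
NOISY[2] = [1] * 7

clean_segments = segment_by_projection(CLEAN)                  # two glyphs
noisy_segments = segment_by_projection(NOISY)                  # merged into one
thresholded_segments = segment_by_projection(NOISY, min_pixels=2)
```

In this toy case, raising `min_pixels` to 2 separates the glyphs again, but a line two pixels thick would defeat that threshold as well, which gives some intuition for why the abstract turns to stronger segmentation methods.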
Related articles
A Projection-based Segmentation Algorithm for Breaking MSN and YAHOO CAPTCHAs
Defeating a CAPTCHA test requires two procedures: segmentation and recognition. Recent research shows that the problem of segmentation is much harder than recognition. In this paper, a new projection-based segmentation algorithm is proposed for the MSN and Yahoo CAPTCHAs. Experimental results show that the proposed algorithm can improve correct segmentation rates ranging from 9% to 14% over the...
Numerical experiments with cooperating multiple quadratic snakes for road extraction
Higher-order active contours or snakes show much promise for extraction of complex objects from noisy imagery. These models provide an elegant mathematical framework for specifying the desired properties of target objects via energy functionals that can be minimized with standard optimization techniques. However, techniques to allow quadratic snakes to change topology during segmentation have n...
HIPUU: a Universally Usable Approach to Defeating Automated Bots
There is clearly a need on the web for security features to stop spam and bots. However, many security features on web sites are not accessible for people with disabilities. A common form of security on web sites is a human interaction proof. HIPs are used to differentiate between humans and automated bots. The most common form of HIP is known as a CAPTCHA (Completely Automated Public Turing te...
Breaking Audio CAPTCHAs
CAPTCHAs are computer-generated tests that humans can pass but current computer systems cannot. CAPTCHAs provide a method for automatically distinguishing a human from a computer program, and therefore can protect Web services from abuse by so-called “bots.” Most CAPTCHAs consist of distorted images, usually text, for which a user must provide some description. Unfortunately, visual CAPTCHAs li...
Studying CAPTCHA Complexity with Eye Tracking
A CAPTCHA is a widely used security mechanism to prevent automated access to webpages by malicious users such as internet bots. Introducing elements such as distortion, distracting backgrounds and noise are some ways to make it harder for computer vision algorithms to break them. However, making CAPTCHAs complex by increasing the intensity of these properties has a direct adverse impact on thei...
Journal title:
- Computers & Security
Volume 37
Publication date: 2013